Distributed and Adaptive Execution of Condor DAGMan Workflows
Abstract
Large-scale applications, in the form of workflows, may require the coordinated usage of resources spread across multiple administrative domains. Scalable solutions need a decentralized approach to coordinate the execution of such workflows. At runtime, adjustments to the workflow execution plan may be required to meet Quality of Service objectives. In this paper, we provide a decentralized execution approach for large-scale workflows on different resource domains. We also provide a low-overhead, decentralized runtime adaptation mechanism to improve the performance of the system. Our prototype implementation is based on the standard Condor DAGMan workflow execution engine and does not require any modifications to Condor or its underlying system.
Related resources
1 Workflow Management in Condor
The Condor Project began in 1988 and has evolved into a feature-rich batch system that targets high-throughput computing; that is, Condor focuses on providing reliable access to computing over long periods of time, instead of highly-tuned, high-performance computing for short periods of time or small numbers of applications. Many Condor users have not only long-running jobs, but have complex se...
Pegasus and DAGMan From Concept to Execution: Mapping Scientific Workflows onto Today's Cyberinfrastructure
In this chapter we describe an end-to-end workflow management system that enables scientists to describe their large-scale analysis in abstract terms, then maps and executes the workflows in an efficient and reliable manner on distributed resources. We describe Pegasus and DAGMan and various workflow restructuring and optimizations they perform and demonstrate the scalability and reliability of...
Workflow Support for Complex Grid Applications: Integrated and Portal Solutions
In this paper we present a workflow solution to support graphically the design, execution, monitoring, and performance visualisation of complex grid applications. The described workflow concept can provide interoperability among different types of legacy applications on heterogeneous computational platforms, such as Condor or Globus based grids. The major design and implementation issues concer...
Extending GTLAB Tag Libraries for Grid Workflows
Portlet-based Grid portals have become a crucial part of the cyberinfrastructure by providing component-based problem solving environments for scientists. Although portals aim to provide user-friendly environments with easy-to-use interfaces, the development of portals and their portlet components are time consuming. We aim to provide reusable components for rapid portlet development. Our appro...
Atlas Data Challenge Production on Grid3
We describe the design and operational experience of the ATLAS production system as implemented for execution on Grid3 resources. The execution environment consisted of a number of grid-based tools: Pacman for installation of VDT-based Grid3 services and ATLAS software releases, the Capone execution service built from the Chimera/Pegasus virtual data system for directed acyclic graph (DAG) gene...